deployment strategies

2022-03-25 · 2 min read

Blog: https://harness.io/blog/continuous-verification/blue-green-canary-deployment-strategies/

Outage Deploy #

  1. Shut down all v1 nodes
  2. Deploy/provision v2 nodes

Pros:

  • Dumbest and simplest approach

Cons:

  • Risky
  • Mandatory outage period
  • Extended outage if a bug in the v2 nodes forces a rollback

Concurrent Deploy #

  1. Concurrently reprovision all v1 nodes to v2 nodes
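
As a rough illustration of the single step above, here is a minimal Python sketch. The `reprovision` helper and the dict-based node records are hypothetical stand-ins, not any real provisioning API; the point is only that every node is rebuilt at once, so v1 and v2 nodes coexist while the pool drains.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def reprovision(node: dict, new_version: str) -> dict:
    # Hypothetical stand-in for whatever actually rebuilds a node.
    node["version"] = new_version
    return node


def concurrent_deploy(nodes: List[dict], new_version: str) -> List[dict]:
    """Reprovision every node at the same time, one worker per node."""
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        # While the pool is running, some nodes are already on the new
        # version and others are still on the old one.
        return list(pool.map(lambda n: reprovision(n, new_version), nodes))


fleet = [{"name": f"node-{i}", "version": "v1"} for i in range(4)]
concurrent_deploy(fleet, "v2")
print([n["version"] for n in fleet])
```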

Pros:

  • Dumb and simple
  • (Hopefully) no downtime : )

Cons:

  • Risky
  • Multiple versions exist at the same time for a short period (potential fun bugs).
  • Service API changes must be backwards compatible.
  • Extended outage if a bug in the v2 nodes forces a rollback

Rolling Deploy #

  1. Incrementally roll out the new version to $n$ nodes, then $2n$ nodes, ..., until all $N$ nodes are updated.
    1. At each step, verify that the new deployment is healthy. If it is not, roll back.
  2. To roll back, take the subset of nodes already on the newer version and reprovision them with the older version.
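
The steps above can be sketched in a few lines of Python. This is only an illustration under assumptions: nodes are plain dicts, setting `node["version"]` stands in for real reprovisioning, and `is_healthy` is a hypothetical health check supplied by the caller.

```python
from typing import Callable, List


def rolling_deploy(
    nodes: List[dict],
    new_version: str,
    batch_size: int,
    is_healthy: Callable[[dict], bool],
) -> bool:
    """Upgrade nodes batch by batch; roll back the upgraded subset on failure.

    Returns True if every node ends up on new_version, False if rolled back.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    upgraded = []  # (node, previous_version) pairs, so rollback knows what to restore
    for start in range(0, len(nodes), batch_size):
        for node in nodes[start:start + batch_size]:
            upgraded.append((node, node["version"]))
            node["version"] = new_version  # stand-in for reprovisioning the node
        # Verify every node upgraded so far before touching the next batch.
        if not all(is_healthy(node) for node, _ in upgraded):
            for node, previous in upgraded:
                node["version"] = previous  # reprovision with the older version
            return False
    return True


fleet = [{"name": f"node-{i}", "version": "v1"} for i in range(6)]
ok = rolling_deploy(fleet, "v2", batch_size=2, is_healthy=lambda n: True)
print(ok, [n["version"] for n in fleet])
```

Only the nodes recorded in `upgraded` are touched during rollback, which mirrors step 2: the rest of the fleet never left the older version.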

Pros:

  • Less risky than the previous strategies
  • No downtime

Cons:

  • Multiple versions exist at the same time for a short period (potential fun bugs).
  • Service API changes must be backwards compatible.
  • Need well-automated deployment verification

Blue-Green Deploy #

Blog: https://martinfowler.com/bliki/BlueGreenDeployment.html

  1. Uses two (ideally) identical environments: (1) "staging" (blue) and (2) "production" (green).
  2. Roll out the upgrade to staging.
  3. Check that staging metrics are good.
  4. Switch user traffic from production to staging. The environment labels then swap: the previous staging becomes production, and the previous production becomes staging.
  5. If you need to roll back (after switching and before the next deploy), just switch traffic back to the previous production environment (now labeled staging).
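
A minimal Python sketch of the switch-over, assuming a hypothetical `BlueGreenRouter` class that only tracks which of the two environments receives user traffic. In practice the "switch" would be a load balancer or DNS change; the `metrics_ok` callback is likewise an assumed placeholder for the metrics check in step 3.

```python
from typing import Callable, Dict


class BlueGreenRouter:
    """Tracks two environments and which one currently serves users."""

    def __init__(self, versions: Dict[str, str], live: str) -> None:
        if len(versions) != 2 or live not in versions:
            raise ValueError("need exactly two environments, one of them live")
        self.versions = dict(versions)
        self.live = live  # the environment acting as production

    @property
    def idle(self) -> str:
        # The environment that is not live, i.e. the one acting as staging.
        return next(name for name in self.versions if name != self.live)

    def deploy(self, new_version: str, metrics_ok: Callable[[str], bool]) -> bool:
        target = self.idle
        self.versions[target] = new_version  # step 2: upgrade staging
        if not metrics_ok(target):           # step 3: check staging metrics
            return False                     # production was never touched
        self.live = target                   # step 4: switch traffic, labels swap
        return True

    def rollback(self) -> None:
        # Step 5: the old production is still running, so just switch back.
        self.live = self.idle


router = BlueGreenRouter({"blue": "v1", "green": "v1"}, live="green")
router.deploy("v2", metrics_ok=lambda env: True)
print(router.live, router.versions[router.live])
```

Rollback is a single pointer flip because the previous production environment is left running untouched, which is also why the approach needs two full environments.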

Pros:

  • Relatively simple and fast
  • Low risk
  • No downtime

Cons:

  • Double the cost, since you need TWO production-ready environments
  • Shifting traffic can have subtle race conditions or lost transactions.

Canary Deploy #

  1. Roll out the deploy to progressively larger subsets of production users / traffic.
  2. For example, deploy to 1% of US users, then 50% of US users, then 100% of US users, then 100% of global users.
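
One way to pick "1% of US users" is to hash each user ID into a stable bucket, so that a user who sees the canary at 1% still sees it at 50%. The sketch below assumes this hashing scheme and a hypothetical `STAGES` table mirroring the example above; neither comes from the original post.

```python
import hashlib
from typing import Optional, Set, TypedDict


class Stage(TypedDict):
    regions: Optional[Set[str]]  # None means every region
    percent: int                 # share of users in those regions, 0 to 100


STAGES = [
    Stage(regions={"US"}, percent=1),
    Stage(regions={"US"}, percent=50),
    Stage(regions={"US"}, percent=100),
    Stage(regions=None, percent=100),
]


def bucket(user_id: str) -> int:
    """Map a user to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def serves_canary(user_id: str, region: str, stage: Stage) -> bool:
    """Decide whether this user gets the new version at this stage."""
    if stage["regions"] is not None and region not in stage["regions"]:
        return False
    return bucket(user_id) < stage["percent"]


for index, stage in enumerate(STAGES):
    hits = sum(serves_canary(f"user-{i}", "US", stage) for i in range(1000))
    print(f"stage {index}: {hits} of 1000 US users on the canary")
```

Because the bucket depends only on the user ID, rolling back is a matter of lowering `percent` again, and no user flips back and forth between versions as the rollout widens.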

Pros:

  • Test code in production with real users and real traffic.
  • Cheaper than Blue-Green
  • Low risk
  • Fast and easy rollback
  • No downtime

Cons:

  • Testing in production
  • High scripting complexity
  • Service API changes must be backwards compatible.
  • Need well-automated deployment verification

A/B Testing #